
Simplify Gramian normalization #302

Merged
ValerianRey merged 10 commits into main from change-gramian-normalization-to-frobenius-norm on Apr 10, 2025
Conversation

@PierreQuinton
Contributor

@PierreQuinton PierreQuinton commented Apr 6, 2025

  • Use the Frobenius norm instead of the spectral norm
  • Replace compute_normalized_gramian by normalize since the normalization is now composable with the gramian computation
  • Remove compute_regularized_normalized_gramian (in favor of composition of simpler functions)
  • Add test_scale_invariant
  • Make gramian_utils functions public to their package
  • Update the changelog entry related to the projection changes of UPGrad and DualProj
  • Add a changelog entry related to the normalization changes of UPGrad, DualProj and CAGrad

Now uses the Frobenius norm instead of the spectral norm.

The advantage is that it unifies the way we compute Gramians, and is therefore a safe step in the direction of autogram.
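To illustrate why the Frobenius-norm variant composes cleanly with the Gramian computation, here is a minimal NumPy sketch. The names `normalize` and `norm_eps` come from this PR, but the implementation below is an assumption, not the actual torchjd code; the key fact it relies on is that $\text{Tr}(G) = \|J\|_F^2$ when $G = J J^\top$.

```python
import numpy as np

def normalize(gramian: np.ndarray, norm_eps: float = 1e-8) -> np.ndarray:
    """Sketch of a trace-based (Frobenius) Gramian normalization.

    Since Tr(G) equals ||J||_F^2 for G = J J^T, dividing G by its trace is
    equivalent to normalizing the underlying Jacobian by its Frobenius norm.
    No eigendecomposition is needed, so this composes with any Gramian
    computation instead of requiring a fused compute_normalized_gramian.
    """
    return gramian / (np.trace(gramian) + norm_eps)

# Composition with a plain Gramian computation:
J = np.arange(6.0).reshape(2, 3)  # toy Jacobian (2 tasks, 3 parameters)
G = normalize(J @ J.T)            # normalized Gramian; Tr(G) is (almost) 1
```

Because the normalization is just a scalar division, it can be applied after any code path that produces a Gramian, which is what makes it a safe step toward autogram.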

TODO:

  • Fix conflict
  • Add changelog entry
  • Test this in the trajectories repo, with various values of the norm_eps param
  • Maybe change the default norm_eps of UPGrad and DualProj
  • Maybe run a jde training to double check that this works when doing deep learning
  • Remove the breaking-change label if it seems fine
  • Run test_scale_invariant on GPU
  • Improve the parametrization of test_scale_invariant
  • Try to reduce the atol of LUS property => Not possible, still due to UPGrad. This is comforting in the sense that this seems very similar to what we had before.

TODO after merging:

  • Add UPGrad to LibMTL since this normalization will be ok with earlier versions of torch

@PierreQuinton PierreQuinton added package: aggregation cc: refactor Conventional commit type for any refactoring, not user-facing, and not typing or perf improvements labels Apr 6, 2025
@codecov

codecov bot commented Apr 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Files with missing lines                     Coverage Δ
src/torchjd/aggregation/_gramian_utils.py    100.00% <100.00%> (ø)
src/torchjd/aggregation/cagrad.py            100.00% <100.00%> (ø)
src/torchjd/aggregation/dualproj.py          100.00% <100.00%> (ø)
src/torchjd/aggregation/mgda.py              100.00% <100.00%> (ø)
src/torchjd/aggregation/upgrad.py            100.00% <100.00%> (ø)

@ValerianRey ValerianRey added the breaking-change This PR introduces a breaking change. label Apr 6, 2025
@ValerianRey
Contributor

ValerianRey commented Apr 6, 2025

How can we test that this is not degrading the performance of UPGrad? Do you think unit tests are sufficient for this? I think we should also use the trajectories project to verify empirically that the trajectories still look good (that will depend on the value of norm_eps, of course). Maybe we should even run one of the experiments of jde to verify that we can still obtain roughly similar performance.

I think we would also have to change the default value of norm_eps in UPGrad and DualProj.

Lastly, this clearly deserves a changelog entry. Arguably, this is not a breaking change, so we can remove that label if you think the behavior is not so different.

@ValerianRey ValerianRey moved this to In Progress in Aggregation Apr 6, 2025
@ValerianRey ValerianRey moved this to In Progress in Autogram Apr 6, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in Aggregation Apr 6, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in Autogram Apr 6, 2025
@PierreQuinton PierreQuinton reopened this Apr 6, 2025
@ValerianRey ValerianRey moved this from Done to In Progress in Autogram Apr 6, 2025
@ValerianRey ValerianRey moved this from Done to In Progress in Aggregation Apr 6, 2025
@PierreQuinton
Contributor Author

As a partial answer to your question, I added a test verifying that project_weights is scale-invariant with respect to the Gramian. So normalizing the Gramian does not change anything in the normal regime (values neither too high nor too low).
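The mechanism behind such a scale-invariance test can be sketched as follows. This is not the torchjd `project_weights` or `test_scale_invariant`; `solve_weights` below is a hypothetical stand-in, and the point is only that any deterministic function applied after a trace-based normalization inherits invariance to positive rescaling of the Gramian.

```python
import numpy as np

def normalize(gramian: np.ndarray, norm_eps: float = 1e-8) -> np.ndarray:
    # Hypothetical sketch of the trace-based (Frobenius) normalization.
    return gramian / (np.trace(gramian) + norm_eps)

def solve_weights(gramian: np.ndarray) -> np.ndarray:
    # Stand-in for project_weights: any deterministic function of the
    # normalized Gramian inherits its scale invariance.
    m = len(gramian)
    return np.linalg.solve(normalize(gramian) + 1e-6 * np.eye(m), np.ones(m))

rng = np.random.default_rng(0)
J = rng.standard_normal((5, 20))
G = J @ J.T
reference = solve_weights(G)
for c in (1e-3, 1.0, 1e3):  # scales within the "normal regime"
    assert np.allclose(solve_weights(c * G), reference, rtol=1e-4)
```

Outside the normal regime (when $c \cdot \text{Tr}(G)$ approaches norm_eps, or overflows), the invariance degrades, which is exactly why the choice of norm_eps matters.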

@PierreQuinton
Contributor Author

PierreQuinton commented Apr 8, 2025

Additionally, the ratio between the scaling factor we had and the one we have now is at most $\sqrt{m}$. This is because we used to scale by the largest singular value, which is the square root of the maximal eigenvalue of the Gramian. Now we scale by the square root of the trace of the Gramian. The trace of the Gramian is the sum of its eigenvalues, and we have $\sigma_{\max}(G) \leq \text{Tr}(G) \leq m\,\sigma_{\max}(G)$. Taking the square root, the new normalization factor lies between the largest singular value and $\sqrt{m}$ times that same singular value.
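This bound is easy to check numerically. A small sketch with a random Jacobian (plain NumPy, not torchjd code):

```python
import numpy as np

rng = np.random.default_rng(42)
m, n = 8, 50
J = rng.standard_normal((m, n))        # m tasks, n parameters
G = J @ J.T                            # Gramian

lam_max = np.linalg.eigvalsh(G).max()  # largest eigenvalue = sigma_max(J)^2
trace = np.trace(G)                    # sum of eigenvalues = ||J||_F^2

# sigma_max(G) <= Tr(G) <= m * sigma_max(G), so the old (spectral) and new
# (Frobenius) scaling factors differ by a ratio of at most sqrt(m).
assert lam_max <= trace <= m * lam_max
ratio = np.sqrt(trace) / np.sqrt(lam_max)
assert 1.0 <= ratio <= np.sqrt(m)
```

Both bounds are tight: the lower bound is approached by a rank-one Gramian, the upper bound by a Gramian proportional to the identity (all eigenvalues equal).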

@ValerianRey
Contributor

> Additionally, the scaling factor that we had versus what we have now has a ratio of at most $\sqrt{m}$. This is because we used to scale by the largest singular value, which is the square root of the maximal eigenvalue of the Gramian. Now we scale by the square root of the trace of the Gramian. The trace of the Gramian is the sum of its eigenvalues, and we have $\sigma_{\max}(G) \leq \text{Tr}(G) \leq m\,\sigma_{\max}(G)$. Taking the square root, we get that the new normalization is between the largest singular value and $\sqrt{m}$ times that same singular value.

Very good to know, thanks! So assuming m = 100 (roughly the maximum realistic value of m that users would have), the difference would be at most a factor of 10. I think this is fine, because from my past experience norm_eps was not so sensitive, and changing it by a factor of 10 should not affect the results much.

@ValerianRey
Contributor

Apart from my comments and todos, LGTM.

@PierreQuinton PierreQuinton removed the breaking-change This PR introduces a breaking change. label Apr 10, 2025
@PierreQuinton PierreQuinton force-pushed the change-gramian-normalization-to-frobenius-norm branch from ac8ddd3 to ee06659 on April 10, 2025 at 06:47
@PierreQuinton
Contributor Author

For me, we could have another PR looking at norm_eps and reg_eps. I do not know (yet) how to assess the correct values for those, and I need to think about it a lot more; the current values are fine for now. We will do JDE and trajectories at that point, so this is out of scope. As for the atol of the LUS property, that is related to a refactor of the property testers and is also out of scope for this PR.
So for me, this is now ready to merge.

@ValerianRey
Contributor

Waiting for Windows tests, and then we can merge!

@ValerianRey ValerianRey changed the title Change the normalization of Gramian Simplify Gramian normalization Apr 10, 2025
@ValerianRey ValerianRey merged commit f0d3eac into main Apr 10, 2025
15 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Autogram Apr 10, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in Aggregation Apr 10, 2025
@ValerianRey ValerianRey deleted the change-gramian-normalization-to-frobenius-norm branch April 10, 2025 11:56